14 research outputs found

    A survey of the European Open Science Cloud services for expanding the capacity and capabilities of multidisciplinary scientific applications

    Open Science is a paradigm in which scientific data, procedures, tools and results are shared transparently and reused by society. The European Open Science Cloud (EOSC) initiative is an effort in Europe to provide an open, trusted, virtual and federated computing environment to execute scientific applications and store, share and reuse research data across borders and scientific disciplines. Additionally, scientific services are becoming increasingly data-intensive, not only in terms of computationally intensive tasks but also in terms of storage resources. To meet those resource demands, computing paradigms such as High-Performance Computing (HPC) and Cloud Computing are applied to e-science applications. However, adapting applications and services to these paradigms is a challenging task, commonly requiring a deep knowledge of the underlying technologies, which often constitutes a general barrier to their uptake by scientists. In this context, EOSC-Synergy, a collaborative project involving more than 20 institutions from eight European countries pooling their knowledge and experience to enhance EOSC’s capabilities and capacities, aims to bring EOSC closer to the scientific communities. This article provides a summary analysis of the adaptations made in the ten thematic services of EOSC-Synergy to embrace this paradigm. These services are grouped into four categories: Earth Observation, Environment, Biomedicine, and Astrophysics. The analysis will lead to the identification of commonalities, best practices and common requirements, regardless of the thematic area of the service. Experience gained from the thematic services can be transferred to new services for the adoption of the EOSC ecosystem framework.
The article made several recommendations for the integration of thematic services in the EOSC ecosystem regarding Authentication and Authorization (mainly federated regional or thematic solutions based on EduGAIN), FAIR data and metadata preservation solutions (for both cataloguing and data preservation, such as EUDAT’s B2SHARE), cloud platform-agnostic resource management services (such as Infrastructure Manager) and workload management solutions.
This work was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 857647, EOSC-Synergy, European Open Science Cloud - Expanding Capacities by building Capabilities. Moreover, this work is partially funded by grant No 2015/24461-2, São Paulo Research Foundation (FAPESP). Francisco Brasileiro is a CNPq/Brazil researcher (grant 308027/2020-5).
Peer reviewed. Article signed by 20 authors: Amanda Calatrava, Hernán Asorey, Jan Astalos, Alberto Azevedo, Francesco Benincasa, Ignacio Blanquer, Martin Bobak, Francisco Brasileiro, Laia Codó, Laura del Cano, Borja Esteban, Meritxell Ferret, Josef Handl, Tobias Kerzenmacher, Valentin Kozlov, Aleš Křenek, Ricardo Martins, Manuel Pavesio, Antonio Juan Rubio-Montero, Juan Sánchez-Ferrero. Postprint (published version).

    BIGNASim: a NoSQL database structure and analysis portal for nucleic acids simulation data.

    Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data and uses the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalent exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force-field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB for storing analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/.
    Funding: Spanish Ministry of Science [BIO2012-32868, SEV-2011-00067, TIN2012-34557]; Catalan Government [2014-SGR-134, 2014-SGR-1051]; Institut Català de Recerca i Estudis Avançats, ICREA Academia [to M.O.]; Instituto de Salud Carlos III-Instituto Nacional de Bioinformática [PT13/0001/0019, PT13/0001/0028]; European Research Council [ERC SimDNA]; European Union, H2020 programme [Elixir-Excellerate: 676559; BioExcel: 674728, MuG: 676566]; PEDECIBA and SNI (ANII, Uruguay) [to P.D.D.]. Funding for open access charge: European Union [MuG: 676566]. Peer reviewed. Postprint (published version).

    A survey of the European Open Science Cloud services for expanding the capacity and capabilities of multidisciplinary scientific applications

    Open Science is a paradigm in which scientific data, procedures, tools and results are shared transparently and reused by society as a whole. The initiative known as the European Open Science Cloud (EOSC) is an effort in Europe to provide an open, trusted, virtual and federated computing environment to execute scientific applications, and to store, share and re-use research data across borders and scientific disciplines. Additionally, scientific services are becoming increasingly data-intensive, not only in terms of computationally intensive tasks but also in terms of storage resources. Computing paradigms such as High Performance Computing (HPC) and Cloud Computing are applied to e-science applications to meet these demands. However, adapting applications and services to these paradigms is not a trivial task, commonly requiring a deep knowledge of the underlying technologies, which often constitutes a barrier to their uptake by scientists in general. In this context, EOSC-SYNERGY, a collaborative project involving more than 20 institutions from eight European countries pooling their knowledge and experience to enhance EOSC's capabilities and capacities, aims to bring EOSC closer to the scientific communities. This article provides a summary analysis of the adaptations made in the ten thematic services of EOSC-SYNERGY to embrace this paradigm. These services are grouped into four categories: Earth Observation, Environment, Biomedicine, and Astrophysics. The analysis will lead to the identification of commonalities, best practices and common requirements, regardless of the thematic area of the service. Experience gained from the thematic services could be transferred to new services for the adoption of the EOSC ecosystem framework.

    Computational Infrastructures for biomolecular research

    Recently, research processes in the life sciences have evolved at a rapid pace. This evolution, mainly due to technological advances, offers more powerful equipment and generalizes the digital format of research data. In the context of this data deluge, we need to cope with the current tsunami of data and prepare for the future. The current model, which consists of regularly adding hardware resources to centralized core facilities without global coordination, is no longer sustainable. Scientific data management and analysis should be enhanced in order to offer services and developments corresponding to the new e-Science uses, and infrastructures are the vehicles for achieving this. We propose and implement research support infrastructures in line with new science directives, adapting them to the scenarios presented by the divergent use cases. Three different domain-specific infrastructures, framed in three different scientific projects, are assembled and introduced in this dissertation. The first case is framed in the clinical data management field and focuses on the data platforms built around two epidemiologic case studies on immune-mediated inflammatory diseases (IMIDs), IMID-clinica and IMID-longitudinal. Making the leap to infrastructures more oriented to analysis process support, the transPLANT infrastructure represents a first foray into the topical cloud computing model. It is focused on plant genomics, and its design became the seed for a more integrative cloud-based solution, this time developed for the non-programmer members of the 3D/4D genomics community. MuGVRE is the front cover of the resulting platform. As the transversal potential of cloud-based computational infrastructures as virtual research environments became obvious, openVRE was implemented as an abstraction of MuGVRE. It offers a vanilla platform encompassing computation, data and administration services, ready to be adopted and customized by other scientific communities.
They all represent an opportunity to establish better research processes through enhanced collaboration, data management, analysis practices and resource optimization.

    Nucleosome Dynamics: a new tool for the dynamic analysis of nucleosome positioning

    We present Nucleosome Dynamics, a suite of programs integrated into a virtual research environment and created to define nucleosome architecture and dynamics from noisy experimental data. The package allows both the definition of nucleosome architectures and the detection of changes in nucleosomal organization due to changes in cellular conditions. Results are displayed in the context of genomic information thanks to different visualizers and browsers, allowing the user a holistic, multidimensional view of the genome/transcriptome. The package shows good performance for both locating equilibrium nucleosome architecture and nucleosome dynamics and provides abundant useful information in several test cases, where experimental data on nucleosome position (and in some cases expression level) have been collected for cells under different external conditions (cell cycle phase, yeast metabolic cycle progression, changes in nutrients or difference in MNase digestion level). Nucleosome Dynamics is free software and is provided under several distribution models.
    Funding: M.O. is an ICREA (Institució Catalana de Recerca i Estudis Avançats) academia researcher; Spanish Ministry of Science [RTI2018-096704-B-100]; Catalan Government [2017-SGR-134]; Instituto de Salud Carlos III–Instituto Nacional de Bioinformática, the European Union's Horizon 2020 research and innovation program, and the Biomolecular and Bioinformatics Resources Platform [ISCIII PT 17/0009/0007 co-funded by the Fondo Europeo de Desarrollo Regional FEDER; Grants Elixir-Excelerate: 676559 and BioExcel2: 823830; ERC:812850; MuG-676566]; MINECO Severo Ochoa Award of Excellence from the Government of Spain (awarded to IRB Barcelona). Funding for open access charge: Spanish Ministry of Science [RTI2018-096704-B-100].

    Identification of IRX1 as a Risk Locus for Rheumatoid Factor Positivity in Rheumatoid Arthritis in a Genome-Wide Association Study.

    Rheumatoid factor (RF) is a well-established diagnostic and prognostic biomarker in rheumatoid arthritis (RA). However, ~20% of RA patients are negative for this anti-IgG antibody. To date, only variation at the HLA-DRB1 gene has been associated with the presence of RF. This study was undertaken to identify additional genetic variants associated with RF positivity. A genome-wide association study (GWAS) for RF positivity was performed using an Illumina Quad610 genotyping platform. A total of 937 RF-positive and 323 RF-negative RA patients were genotyped for >550,000 single-nucleotide polymorphisms (SNPs). Association testing was performed using an allelic chi-square test implemented in PLINK software. An independent cohort of 472 RF-positive and 190 RF-negative RA patients was used to validate the most significant findings. In the discovery stage, a SNP in the IRX1 locus on chromosome 5p15.3 (SNP rs1502644) showed a genome-wide significant association with RF positivity (P = 4.13 × 10^-8, odds ratio [OR] 0.37 [95% confidence interval (95% CI) 0.26-0.53]). In the validation stage, the association of IRX1 with RF was replicated in an independent group of RA patients (P = 0.034, OR 0.58 [95% CI 0.35-0.97]; combined P = 1.14 × 10^-8, OR 0.43 [95% CI 0.32-0.58]). To our knowledge, this is the first GWAS of RF positivity in RA. Variation at the IRX1 locus on chromosome 5p15.3 is associated with the presence of RF. Our findings indicate that IRX1 and HLA-DRB1 are the strongest genetic factors for RF production in RA.
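For readers unfamiliar with the statistic named in the abstract above, the following is a minimal sketch of an allelic chi-square test on a 2x2 table of allele counts, with an odds ratio and a Woolf-type 95% confidence interval. It is not the PLINK implementation and not the authors' pipeline; the function name `allelic_test` and the allele counts in the usage example are hypothetical and chosen only for illustration.

```python
import math


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio, (ci_low, ci_high)).
    All four counts must be positive for the odds ratio and CI to be defined.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d

    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Odds ratio with a Woolf (log-normal) 95% confidence interval.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

    return chi2, p_value, odds_ratio, (ci_low, ci_high)


if __name__ == "__main__":
    # Hypothetical allele counts (two alleles per genotyped individual);
    # these are NOT the counts from the study.
    chi2, p, odds, (lo, hi) = allelic_test(20, 80, 40, 60)
    print(f"chi2={chi2:.3f}  p={p:.4g}  OR={odds:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

An odds ratio below 1, as reported for rs1502644, means the counted allele is less frequent among cases (here, RF-positive patients) than among the comparison group.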

    A genome-wide association study identifies SLC8A3 as a susceptibility locus for ACPA-positive rheumatoid arthritis.

    RA patients with serum ACPA have a strong and specific genetic background. The objective of the study was to identify new susceptibility genes for ACPA-positive RA using a genome-wide association approach. A total of 924 ACPA-positive RA patients with joint damage in hands and/or feet, and 1524 healthy controls were genotyped in 582 591 single-nucleotide polymorphisms (SNPs) in the discovery phase. In the validation phase, the most significant SNPs in the genome-wide association study representing new candidate loci for RA were tested in an independent cohort of 863 ACPA-positive patients with joint damage and 1152 healthy controls. All individuals from the discovery and validation cohorts were Caucasian and of Southern European ancestry. In the discovery phase, 60 loci not previously associated with RA risk showed evidence for association at P […] SLC8A3 was identified as a new risk locus for ACPA-positive RA. This study demonstrates the advantage of analysing relevant subsets of RA patients to identify new genetic risk variants.